Concerning with On-Chip Network Features to Improve Cache Coherence Protocols for CMPs
نویسندگان
چکیده
Chip multiprocessors (CMPs) with on-chip network connecting processor cores have been pervasively accepted as a promising technology to efficiently utilize the ever increasing density of transistors on a chip. Communications in CMPs require invalidating cached copies of a shared data block. The coherence traffic incurs more and more significant overhead as the number of cores in a CMP increases. Conventional designs of cache coherence protocols do not take into account characteristics of underlying networks for flexibility reasons. However, in CMPs, processor cores and the on-chip network are tightly integrated. Exposing the network features to cache coherence protocols will unveil some optimization opportunities. In this paper, we propose distance aware protocol and multi-target invalidations, which exploit the network characteristics to reduce the invalidation traffic overhead at negligible hardware cost. Experimental results on a 16-core CMP simulator showed that the two mechanisms reduced the average invalidation traffic latency by 5%, up to 8%.
منابع مشابه
A NoC-level Support for Broadcast-based Coherence Protocols
Chip Multiprocessor Systems (CMPs) rely on a cache coherency protocol to maintain memory access coherence between cached data and main memory. The Hammer coherency protocol is appealing as it eliminates most of the space overhead when compared to a directory protocol. However, it generates much more traffic, thus stressing the NoC and having worse performance in terms of power consumption. When...
متن کاملCache Coherence Strategies for Speculative Multithreading CMPs : Characterization and Performance Study
Thread-level memory speculation is one of speculation techniques usually employed in speculative multithreading architectures. On shared-bus chip multiprocessors (CMPs), the technique can be implemented by extending their cache coherence mechanisms. Several implementations have been proposed and evaluated. However, there is no study that compares the impact of the taken strategies and identifie...
متن کاملScalable Directory Organization for Tiled CMP Architectures
Although directory-based cache coherence protocols are the best choice when designing chip multiprocessor architectures (CMPs) with tens of processor cores on chip, the memory overhead introduced by the directory structure may not scale gracefully with the number of cores. In this work, we show that a directory organization based on duplicating tags, which are distributed among the tiles of a t...
متن کاملDirectoryless shared memory architecture using thread migration and remote access
Distributed directory cache coherence protocols for current many-core CMPs are not only difficult and error-prone to implement and verify, but also provide suboptimal performance when a thread requires access to large amounts of data distributed across the chip: the data must be brought to the core where the thread is running, incurring delays and energy costs. In this paper, we propose an appr...
متن کاملCharacterization of a List-Based Directory Cache Coherence Protocol for Manycore CMPs
The development of efficient and scalable cache coherence protocols is a key aspect in the design of manycore chip multiprocessors. In this work, we review a kind of cache coherence protocols that, despite having been already implemented in the 90s for building large-scale commodity multiprocessors, have not been seriously considered in the current context of chip multiprocessors. In particular...
متن کامل